sequential learning
Sequential Learning of the Pareto Front for Multi-objective Bandits
Crépon, Elise, Garivier, Aurélien, Koolen, Wouter M
We study the problem of sequential learning of the Pareto front in multi-objective multi-armed bandits. An agent is faced with $K$ possible arms to pull. At each turn she picks one and receives a vector-valued reward. When she thinks she has enough information to identify the Pareto front of the different arm means, she stops the game and gives an answer. We are interested in designing algorithms such that the answer given is correct with probability at least $1-\delta$. Our main contribution is an efficient implementation of an algorithm achieving the optimal sample complexity when the risk $\delta$ is small. With $K$ arms in $d$ dimensions, $p$ of which are in the Pareto set, the algorithm runs in time $O(Kp^d)$ per round.
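To make the target of the identification problem concrete, here is a minimal sketch of computing the Pareto set of known mean vectors. It is not the paper's algorithm (which must decide when to stop sampling and runs in $O(Kp^d)$ per round); it is a naive quadratic check, and the function name `pareto_set` and the example means are assumptions chosen for illustration.

```python
import numpy as np

def pareto_set(means):
    """Return indices of arms whose mean vector is not dominated.

    Arm j dominates arm i if it is at least as good in every
    coordinate and strictly better in at least one (larger is better).
    Naive O(K^2 d) check, for illustration only.
    """
    means = np.asarray(means, dtype=float)
    optimal = []
    for i in range(means.shape[0]):
        dominated = False
        for j in range(means.shape[0]):
            if j == i:
                continue
            if np.all(means[j] >= means[i]) and np.any(means[j] > means[i]):
                dominated = True
                break
        if not dominated:
            optimal.append(i)
    return optimal

# Hypothetical example: K = 4 arms in d = 2 dimensions.
example_means = [[1.0, 0.2], [0.8, 0.8], [0.2, 1.0], [0.5, 0.5]]
print(pareto_set(example_means))  # arm 3 is dominated by arm 1
```

In the bandit setting the means are unknown, so the agent would apply such a check to estimates and keep sampling until the answer is reliable at risk level $\delta$.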
Reviews: Parsimonious Quantile Regression of Financial Asset Tail Dynamics via Sequential Learning
Summary: This paper describes an approach to learning the dynamics of financial time series. The authors describe a parametric quantile function with four parameters (modelling location, scale, and the shapes of the left- and right-hand tails of the conditional distribution of returns). The time dynamics of these parameters are learned using an LSTM neural network. The performance of the algorithm is compared to various GARCH-type specifications and a TQR model (which combines "traditional" quantile regression with an LSTM neural network). Strengths: I enjoyed reading the paper.
Reviews: Complex Gated Recurrent Neural Networks
Summary of approach and contributions: The authors resurrect the pioneering work of Hirose on complex valued neural networks in order to provide a new RNN based on a complex valued activation/transition function and a complex argument gating mechanism. In order to obtain a differentiable function that is not constant and yet bounded, the authors step away from holomorphic functions and employ CR calculus. The authors show experimental improvements on two synthetic tasks and one actual data set. Strengths of the paper: o) Moving away from strict holomorphy and using CR calculus to apply complex valued networks to RNNs is interesting as a novel technique. I think that the authors should spend more time explaining how phases can be easily encoded in the complex domain and therefore why such complex representations can be advantageous for sequential learning.
Memory-Based Dual Gaussian Processes for Sequential Learning
Chang, Paul E., Verma, Prakhar, John, S. T., Solin, Arno, Khan, Mohammad Emtiyaz
Sequential learning with Gaussian processes (GPs) is challenging when access to past data is limited, for example, in continual and active learning. In such cases, errors can accumulate over time due to inaccuracies in the posterior, hyperparameters, and inducing points, making accurate learning challenging. Here, we present a method to keep all such errors in check using the recently proposed dual sparse variational GP. Our method enables accurate inference for generic likelihoods and improves learning by actively building and updating a memory of past data. We demonstrate its effectiveness in several applications involving Bayesian optimization, active learning, and continual learning.
Sequential Learning from Noisy Data: Data-Assimilation Meets Echo-State Network
This paper explores the problem of training a recurrent neural network from noisy data. While neural-network-based dynamic predictors perform well with noise-free training data, prediction with noisy inputs during the training phase poses a significant challenge. Here a sequential training algorithm is developed for an echo-state network (ESN) by incorporating noisy observations using an ensemble Kalman filter. The resultant Kalman-trained echo-state network (KalT-ESN) outperforms the ESN trained traditionally with a least-squares algorithm while still being computationally cheap. The proposed method is demonstrated on noisy observations from three systems: two synthetic datasets from chaotic dynamical systems and a set of real-time traffic data.
Learning over No-Preferred and Preferred Sequence of Items for Robust Recommendation
Burashnikova, Aleksandra, Maximov, Yury, Clausel, Marianne, Laclau, Charlotte, Iutzeler, Franck, Amini, Massih-Reza
In this paper, we propose a theoretically supported sequential strategy for training a large-scale Recommender System (RS) over implicit feedback, mainly in the form of clicks. The proposed approach consists in minimizing a pairwise ranking loss over blocks of consecutive items, where each block is a sequence of non-clicked items followed by a clicked one for a given user. We present two variants of this strategy where model parameters are updated using either the momentum method or a gradient-based approach. To prevent updating the parameters for an abnormally high number of clicks over some targeted items (mainly due to bots), we introduce an upper and a lower threshold on the number of updates for each user. These thresholds are estimated over the distribution of the number of blocks in the training set. They affect the decisions of the RS by shifting the distribution of items that are shown to the users. Furthermore, we provide a convergence analysis of both algorithms and demonstrate their practical efficiency over six large-scale collections with respect to various ranking measures and computational time.
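The block construction described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the helper names `split_into_blocks` and `block_ranking_loss`, the choice of a logistic surrogate for the pairwise ranking loss, and the decision to skip clicks with no preceding non-clicked items are all hypothetical. The per-user update thresholds are not modelled.

```python
import math

def split_into_blocks(interactions):
    """Group one user's (item, clicked) sequence into blocks.

    Each block is (list of non-clicked items, clicked item).
    Assumption: a click with no preceding non-clicked items is
    skipped, and trailing non-clicked items are discarded.
    """
    blocks, negatives = [], []
    for item, clicked in interactions:
        if clicked:
            if negatives:
                blocks.append((negatives, item))
                negatives = []
        else:
            negatives.append(item)
    return blocks

def block_ranking_loss(scores, block):
    """Average logistic pairwise loss over one block.

    `scores` maps item -> model score (hypothetical); the loss is
    small when the clicked item outscores each non-clicked item.
    """
    negatives, positive = block
    losses = [math.log1p(math.exp(-(scores[positive] - scores[n])))
              for n in negatives]
    return sum(losses) / len(losses)

# Hypothetical interaction log for one user.
log = [(1, False), (2, False), (3, True), (4, True), (5, False), (6, True)]
blocks = split_into_blocks(log)
print(blocks)
```

A sequential trainer would then take one parameter update per block, in the order the blocks occur for each user.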
Sequential Learning for Domain Generalization
Li, Da, Yang, Yongxin, Song, Yi-Zhe, Hospedales, Timothy
In this paper we propose a sequential learning framework for Domain Generalization (DG), the problem of training a model that is robust to domain shift by design. Various DG approaches have been proposed with different motivating intuitions, but they typically optimize for a single step of domain generalization: training on one set of domains and generalizing to one other. Our sequential learning is inspired by the idea of lifelong learning, where accumulated experience means that learning the $n^{th}$ thing becomes easier than the $1^{st}$ thing. In DG this means encountering a sequence of domains and at each step training to maximise performance on the next domain. The performance at domain $n$ then depends on the previous $n-1$ learning problems. Thus backpropagating through the sequence means optimizing performance not just for the next domain, but for all following domains. Training on all such sequences of domains provides dramatically more 'practice' for a base DG learner compared to existing approaches, thus improving performance on a true testing domain. This strategy can be instantiated for different base DG algorithms, but we focus on its application to the recently proposed Meta-Learning Domain Generalization (MLDG). We show that for MLDG it leads to a simple-to-implement and fast algorithm that provides consistent performance improvement on a variety of DG benchmarks.
Evaluating multi-class learning strategies in a generative hierarchical framework for object detection
Fidler, Sanja, Boben, Marko, Leonardis, Ales
Multiple object class learning and detection is a challenging problem due to the large number of object classes and their high visual variability. Specialized detectors usually excel in performance, while joint representations optimize sharing and reduce inference time --- but are complex to train. Conveniently, sequential learning of categories cuts down training time by transferring existing knowledge to novel classes, but cannot fully exploit the richness of shareability and might depend on ordering in learning. In hierarchical frameworks these issues have been little explored. In this paper, we show how different types of multi-class learning can be done within one generative hierarchical framework and provide a rigorous experimental analysis of various object class learning strategies as the number of classes grows.